Language Technology
DHRITI – Malayalam OCR
DHRITI is a web-based OCR solution developed by ICFOSS that helps users extract text from Malayalam documents and search their contents. The DHRITI web platform also provides an option for editing the extracted text.
The project aims to create an efficient search and information extraction system that can process a large set of Malayalam documents and return the most accurate results for a user query. The main features of the solution are:
• Converting printed documents into text files using Optical Character Recognition software.
• Searching and finding contents within the documents.
• Extracting information from the printed documents.
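As a rough illustration of the search feature, the sketch below builds a small inverted index over text that has already been extracted by OCR. It is not DHRITI's actual implementation; the function names (`tokenize`, `build_index`, `search`) and the whitespace-based tokenisation are assumptions made for this example only.

```python
import unicodedata
from collections import defaultdict

PUNCTUATION = ".,;:!?()\"'"

def tokenize(text):
    # Normalise to NFC so that visually identical Malayalam strings compare
    # equal, then split on whitespace and trim common punctuation.
    # (Hypothetical tokeniser; a real system may need language-aware rules.)
    text = unicodedata.normalize("NFC", text)
    tokens = (t.strip(PUNCTUATION) for t in text.split())
    return [t for t in tokens if t]

def build_index(documents):
    # Map each token to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    # Return ids of documents that contain every token of the query.
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result
```

Here `documents` is assumed to be a dictionary from a document id to its OCR-extracted text; `search(build_index(documents), query)` then returns the ids of the matching documents.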
OCR Data Analytics:
ICFOSS has also customised the DHRITI OCR solution to integrate it with the Chief Minister's Public Grievance Redress Portal for data analytics and data extraction. The solution helps in extracting and analysing information from the complaints raised by the general public.
The solution helps the CMO Cell / departments to extract data from petitions received either from the general public or from different departments of the Government of Kerala, in the form of pictures, PDFs, etc.
The OCR data analyser helps to:
• Find names of people, organizations and places in the document.
• Extract the From, To and Subject fields from the petitions.
• Run similarity checks between user queries and the named entities.
• Return the most relevant results from the search process.
• Edit, autofill and correct converted petitions.
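Two of the steps above, field extraction and the similarity check, can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the method DHRITI uses: it assumes the petition text carries header lines labelled `From`, `To` and `Subject` in English, and it uses `difflib.SequenceMatcher` as a stand-in similarity measure. The names `extract_fields`, `best_match` and the `threshold` value are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Assumed header layout: "From: ...", "To: ...", "Subject: ..." on their own lines.
FIELD_PATTERN = re.compile(
    r"^[ \t]*(From|To|Subject)[ \t]*[:\-][ \t]*(.+?)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

def extract_fields(text):
    # Collect the first occurrence of each header field in the text.
    fields = {}
    for match in FIELD_PATTERN.finditer(text):
        key = match.group(1).capitalize()
        fields.setdefault(key, match.group(2))
    return fields

def best_match(query, entities, threshold=0.6):
    # Return the entity most similar to the query, or None if nothing
    # reaches the (arbitrarily chosen) similarity threshold.
    scored = [
        (SequenceMatcher(None, query.lower(), entity.lower()).ratio(), entity)
        for entity in entities
    ]
    if not scored:
        return None
    score, entity = max(scored)
    return entity if score >= threshold else None
```

The list passed as `entities` would come from whatever named-entity step precedes the check; a fuzzy ratio of this kind tolerates small OCR or typing errors in the query, at the cost of occasional false matches when the threshold is set too low.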